Text-Independent Speaker Verification via State Alignment
Authors
Abstract
To model speech utterances at a finer granularity, this paper presents a novel state-alignment-based supervector modeling method for text-independent speaker verification. It takes advantage of the state alignment used in hidden Markov model (HMM) based acoustic modeling for speech recognition. In this way, the proposed method converts a text-independent speaker verification problem into a state-dependent one. First, phoneme HMMs are trained. Then a Gaussian mixture model (GMM) for each clustered state is trained in a data-driven manner from the states of all phoneme HMMs. Next, the given speech utterance is modeled as sub-GMM supervectors at the state level, which are then aligned to form the final supervector. In addition, to account for the duration differences between states, a weighting method is proposed for kernel-based support vector machine (SVM) classification. Experimental results on the SRE 2008 core-core dataset show that the proposed methods outperform traditional GMM supervector modeling followed by SVM (GSV-SVM), yielding relative improvements of 8.4% in EER and 5.9% in minDCF.
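The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes that each frame already carries a clustered-state label (for example from an HMM forced alignment), that each state has a diagonal-covariance GMM, and that sub-GMM supervectors are obtained by relevance-MAP adaptation of the means. The function names, the relevance factor, and the square-root duration weighting are hypothetical choices made for illustration; the abstract does not specify the authors' exact adaptation or weighting formulas.

```python
import numpy as np

def map_adapt_means(frames, means, variances, weights, relevance=16.0):
    """Relevance-MAP adaptation of diagonal-covariance GMM means (assumed scheme)."""
    # Log-density of every frame under every component: shape (T, C).
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)           # component posteriors

    counts = post.sum(axis=0)                         # soft counts, shape (C,)
    first_order = post.T @ frames                     # shape (C, D)
    data_means = first_order / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]
    # Components (or whole states) without data fall back to the prior means.
    return alpha * data_means + (1.0 - alpha) * means

def state_aligned_supervector(frames, state_ids, state_gmms, duration_weighting=True):
    """Concatenate per-state sub-GMM supervectors in a fixed state order.

    state_gmms is a list of (means, variances, weights) tuples, one per
    clustered state; state_ids[t] is the state index assigned to frame t.
    """
    num_frames = len(frames)
    parts = []
    for state, (means, variances, weights) in enumerate(state_gmms):
        state_frames = frames[state_ids == state]
        sub = map_adapt_means(state_frames, means, variances, weights).ravel()
        if duration_weighting:
            # Hypothetical weighting: scale by the square root of the fraction
            # of frames spent in this state, so longer states count for more
            # in a linear kernel between two supervectors.
            sub = sub * np.sqrt(len(state_frames) / num_frames)
        parts.append(sub)
    return np.concatenate(parts)
```

Because every utterance yields sub-supervectors in the same state order, dimension k of the final supervector always refers to the same state and mixture component, which is what lets a kernel SVM compare utterances with different spoken content. The usual GSV mean normalization by mixture weights and covariances is omitted here for brevity.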
Similar resources
DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances
We investigate how to improve the performance of DNN i-vector-based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. For a short utterance, text-constrained verification can deliver better performance than text-independent verification because of its smaller, limited vocabulary. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...
Investigation of Frame Alignments for GMM-based Text-prompted Speaker Verification
Frame alignment plays an important role in GMM-based speaker verification. In text-prompted speaker verification, it is common practice to use the transcriptions to align speech frames to phonetic units. In this paper, we compare the performance of alignments from a hidden Markov model (HMM) and a deep neural network (DNN), using the same training data and phonetic units. We incorporate a pho...
Deep Neural Networks and Hidden Markov Models in i-vector-based Text-Dependent Speaker Verification
Techniques making use of Deep Neural Networks (DNN) have recently been seen to bring large improvements in text-independent speaker recognition. In this paper, we verify that DNN-based methods also deliver excellent performance in the context of text-dependent speaker verification. We build our system on the previously introduced HMM-based i-vector approach, where phone models are used ...
Factor analysis based channel compensation in speaker verification
This report describes a powerful channel compensation method for the text-independent speaker verification task. This powerful method is developed in the LRDE Speaker Verification framework. The purpose of a text-independent speaker verification system is to check whether a hypothesised speaker is really the author of a speech utterance. The channel compensation problem arises when training dat...
Integrating time-alignment information into the decision making for text-dependent HMM-based speaker verification
This paper proposes integrating time-alignment information into the decision making for HMM-based text-dependent speaker verification. The principle is to consider the acoustic score and the time alignment as joint observations, for which a log-likelihood ratio is computed and compared to a threshold. It is shown that such integration has two distinct aspects, one being a kind of adaptation of...